x <- c(14, 12, 6)                                ### vector
n <- c('Tottenham', 'Aston Villa', 'Brentford')  ### vector of names
names(x) <- n                                    ### assigning names
x
  Tottenham Aston Villa   Brentford
         14          12           6
Basic data structures (4)
Matrices
In R, a matrix is a collection of elements of the same data type arranged in a two-dimensional rectangular layout. Matrices are usually created with the matrix() function:
matrix(data = c(1, 2, 3, 5, 8, 13), ### the data elements (first Fibonacci numbers)
       ncol = 3,                    ### number of columns
       nrow = 2,                    ### number of rows
       byrow = TRUE)                ### fill matrix by rows
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    5    8   13
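To see what byrow changes, here is a minimal sketch that builds the same data both ways. The names fib, by_row and by_col are illustrative only. matrix() fills column by column unless byrow = TRUE is given.

```r
fib <- c(1, 2, 3, 5, 8, 13)

by_row <- matrix(fib, nrow = 2, ncol = 3, byrow = TRUE)  # fill row by row
by_col <- matrix(fib, nrow = 2, ncol = 3)                # default: fill column by column

by_row[1, ]   # first row: 1 2 3
by_col[1, ]   # first row: 1 3 8
dim(by_row)   # 2 3 (rows, columns)
```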
Basic data structures (5)
Named matrices
As with named vectors, a matrix can have labels attached to its rows and/or columns:
### Generating a named matrix
M <- matrix(data = c(1, 2, 3, 5, 8, 13), ### the data elements (first Fibonacci numbers)
            ncol = 3,                    ### number of columns
            nrow = 2,                    ### number of rows
            byrow = TRUE)                ### fill matrix by rows
rn <- c('r1', 'r2')        ### vector of rownames
cn <- c('c1', 'c2', 'c3')  ### vector of colnames
rownames(M) <- rn          ### assign rownames
colnames(M) <- cn          ### assign colnames
M
   c1 c2 c3
r1  1  2  3
r2  5  8 13
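Once a matrix has row and column names, they can be used for indexing instead of positions. A short sketch, rebuilding the same M so that the block runs on its own:

```r
M <- matrix(c(1, 2, 3, 5, 8, 13), nrow = 2, ncol = 3, byrow = TRUE)
rownames(M) <- c('r1', 'r2')
colnames(M) <- c('c1', 'c2', 'c3')

M['r2', 'c3']   # single element: 13
M['r1', ]       # whole first row, returned as a named vector
M[, 'c2']       # whole second column, returned as a named vector
```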
Basic data structures (6)
Lists
A list is a collection of objects (numbers, vectors, matrices, etc.). Lists are the most general and flexible data structures in R because they can contain elements of any type (including other lists).
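A minimal sketch of a list mixing several types, including a nested list. The element names (number, vec, mat, nested) are made up for illustration.

```r
l <- list(number = 42,
          vec    = c(14, 12, 6),
          mat    = matrix(1:4, nrow = 2),
          nested = list(label = "inner"))

l$number          # access by name with $
l[["vec"]][2]     # second element of the vector: 12
l$nested$label    # element of the nested list: "inner"
length(l)         # 4 top-level elements
```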
New data frames are usually created with the data.frame() function.
Beware: in R versions before 4.0.0, data.frame()’s default behaviour turns strings into factors (since R 4.0.0 the default is stringsAsFactors = FALSE).
Little detour: factors
Factors - Definition
Factors are used to represent categorical data, which can be either ordinal (e.g., company hierarchies) or nominal, i.e. unordered (e.g., hair color).
A factor MUST be imagined as a vector of integers, where each integer is associated with a label.
Factors: example
Let’s try to create our first factor:
x <- factor(c("yes", "yes", "no"))
x
[1] yes yes no
Levels: no yes
The order in which the levels are represented can be modified using the levels argument of the factor function. By default, the levels are ordered alphabetically.
Additionally, if the levels have a hierarchy (e.g., soldier, lieutenant, marshal, etc.), we can indicate this by specifying ordered = TRUE in the factor function.
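As a sketch of both arguments, the example below reuses the military ranks mentioned above. The variable names ranks, f_default and f_ordered are illustrative.

```r
ranks <- c("lieutenant", "soldier", "marshal", "soldier")

# Default: levels sorted alphabetically, with no ordering between them
f_default <- factor(ranks)
levels(f_default)   # "lieutenant" "marshal" "soldier"

# Explicit level order plus ordered = TRUE gives an ordered factor
f_ordered <- factor(ranks,
                    levels = c("soldier", "lieutenant", "marshal"),
                    ordered = TRUE)
f_ordered[1] > f_ordered[2]   # TRUE: lieutenant ranks above soldier
as.integer(f_ordered)         # underlying integer codes: 2 1 3 1
```

With ordered = TRUE, comparison operators such as < and > follow the order given in levels rather than alphabetical order.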
Factors: example
Given a factor, we can use the table function to obtain a table with the levels and frequencies of the variable.
x <- factor(c("A", "B", "A", "B"))
x # printing the factor
[1] A B A B
Levels: A B
str(x) # structure of the factor
Factor w/ 2 levels "A","B": 1 2 1 2
table(x) # table with levels and frequencies
x
A B
2 2
Factors: Warning
Never forget that a factor is nothing more than a vector of integers, each associated with a label.
x <- factor("a")
y <- factor("b")
c(x, y)
[1] a b
Levels: a b
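The output above is what R 4.1.0 and later print; earlier versions returned the bare integer codes ([1] 1 1) when combining factors with c(). The integer codes still surface in other places. A common trap, sketched here with made-up values, is converting a factor of digit strings to numbers:

```r
f <- factor(c("10", "20", "10"))

as.integer(f)                  # integer codes, not the values: 1 2 1
as.numeric(as.character(f))    # the intended numbers: 10 20 10
```

Converting through as.character() first recovers the labels before they are parsed as numbers.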
A very convenient package for working with factors is forcats.
Basic data structures (11)
Data frames
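To avoid the problem of converting strings into factors, use stringsAsFactors = FALSE when creating data frames. This has been the default since R 4.0.0, so the argument is only strictly needed on older versions.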
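A minimal sketch comparing the two settings. The column names team and value, and the data frame names df_chr and df_fct, are illustrative. Passing the argument explicitly makes the result independent of the R version.

```r
teams  <- c("Tottenham", "Aston Villa", "Brentford")
values <- c(14, 12, 6)

df_chr <- data.frame(team = teams, value = values, stringsAsFactors = FALSE)
df_fct <- data.frame(team = teams, value = values, stringsAsFactors = TRUE)

class(df_chr$team)   # "character"
class(df_fct$team)   # "factor"
str(df_chr)          # compact overview of the columns and their types
```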